Merging Case Relations into VSM to Improve Information Retrieval Precision

Authors

  • Hongtao Wang
  • Maosong Sun
  • Shaoming Liu
Abstract

This paper presents an approach that merges case relations into the well-known Vector Space Model (VSM), leading to a new model named C-VSM (Case relation-based VSM). A Chinese case system with 23 case relations is established, and a Chinese Olympic news corpus of 7,662 sentences, denoted COCS, is constructed by manually annotating it with these 23 case relations. We use 50 queries on COCS as a test set. Experimental results on the test set show that C-VSM outperforms W-VSM (Word-based VSM) by 3.4% in 11-point average precision. Almost all previous studies on semantic IR obtained results no better, or even worse, than W-VSM; our work thus validates the usefulness of case relations in IR, though the validation is still preliminary. The proposed model is believed to be language-independent.
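The abstract reports its result as 11-point average precision but does not say how the figure was computed. As a minimal sketch, assuming the standard definition (the mean of interpolated precision at recall levels 0.0, 0.1, ..., 1.0), the measure for a single query could be computed as follows. The function name and the document identifiers are hypothetical and do not come from the paper.

```python
def eleven_point_average_precision(ranked_ids, relevant_ids):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0 for one query.

    Assumes the standard definition; the paper's exact procedure is not
    given in the abstract.
    """
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("need at least one relevant document")

    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: the best precision at any recall >= level.
        # If that recall level is never reached, precision counts as 0.
        candidates = [p for r, p in points if r >= level - 1e-12]
        total += max(candidates, default=0.0)
    return total / 11


if __name__ == "__main__":
    # Hypothetical ranking for one query; d1 and d3 are the relevant documents.
    score = eleven_point_average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
    print(f"{score:.4f}")
```

Averaging this per-query value over all 50 queries would give a system-level score for comparing two retrieval models, under the same assumption about the definition.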

Similar resources

Combining Text Vector Representations for Information Retrieval

This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Random Indexing; and Holographic Reduced Representations (HRRs). Random indexing uses co-occurrence information among words to generate semantic context vectors that are the sum of randomly generated term identity vectors. HRRs are...

Topic Analysis for Psychiatric Document Retrieval

Psychiatric document retrieval attempts to help people to efficiently and effectively locate the consultation documents relevant to their depressive problems. Individuals can understand how to alleviate their symptoms according to recommendations in the relevant documents. This work proposes the use of high-level topic information extracted from consultation documents to improve the precision o...

The phrase-based vector space model for automatic retrieval of free-text medical documents (Wenlei Mao, Wesley W. Chu)

Objective: To develop a document indexing scheme that improves the retrieval effectiveness of free-text medical documents. Design: The phrase-based vector space model (VSM) uses multi-word phrases as indexing terms. Each phrase consists of a concept in the Unified Medical Language System (UMLS) and its corresponding component word stems. The similarity between concepts is defined by their rela...

Fusion of Retrieval Models at CLEF 2008 Ad-Hoc Persian Track

Metasearch engines submit the user query to several underlying search engines and then merge their retrieved results to generate a single list that is more effective to the users’ information needs. According to the idea behind metasearch engines, it seems that merging the results retrieved from different retrieval models will improve the search coverage and precision. In this study, we have in...

Broadening Vector Space Schemes for Improving the Quality of Information Retrieval

The vector space model (VSM) of information retrieval suffers in two areas: it does not utilise term positions, and it treats every term as being independent. We examine two information retrieval methods based on the simple vector space model. The first uses the query term position flow within the documents to calculate the document score; the second includes related terms in the query by making...

Publication date: 2005